431 Class 01

Thomas E. Love, Ph.D.

2023-08-29

Getting To These Slides

Our web site: https://thomaselove.github.io/431-2023/

Visit the Course Calendar at the top of the page, which will take you to the Class 01 README page.

  • These Slides for Class 01 are linked at the Class 01 README.
    • We’ll look at the HTML slides during class.
    • We also provide the Quarto code I used to build the slides.

This is PQHS / CRSP / MPHP 431

Correlation

First Activity

First Thing

Write down your guess of Dr. Love’s age in years in the appropriate spot on the convenient piece of paper we’ve provided. Hang on to the paper, as you’ll need it again later.

Here’s a picture, in case that’s helpful.

Dr. Love

Course Details

Instructor: Thomas E. Love, Ph.D.

Email: Thomas dot Love at case dot edu

  • (best way to reach me, although you won’t use it much)

Our web site: https://thomaselove.github.io/431-2023/

  • The Contact Us page on our web site provides details on how to get help.
  • If you’ve spent 15 minutes working on something and are stuck, don’t keep working on it. ASK FOR HELP.

Teaching Assistants (Fall 2023)

  • TA Office Hours will begin Tuesday 2023-09-05.

Structure of the 431 Course

  • Exploratory Data Analysis, Visualization
  • Statistical Inference, Making Comparisons
  • Linear Regression and related Models

Course Philosophy (in one slide)

The course is about biostatistics, replicable research, using state-of-the-art tools (R, RStudio, Quarto), and thinking about how science is most effectively done.

  • It is more a course in how to do things (highly applied) than a theoretical/mathematical justification for why we do them. We focus here on practical work.
  • It’s mostly about getting you doing data science projects for biological, medical and health applications.

More on all of this in the Course Syllabus, of course.

What is Data Science about?

Source: Figure 1.1 in https://r4ds.hadley.nz/intro.html

What will we be reading?

Tools We Use in 431

  • Calendar (for deadlines and what’s next)
  • Campuswire (for discussion / Q and A outside of class)
  • Canvas (for submission of work)
  • Zoom (for TA office hours and class recordings)
  • Shared Google Drive (mostly Lab/Quiz/Project answer sketches and feedback)
  • Main Website links to everything you’ll need (see next slide)

Our Main Website

The link to this page is at the bottom of every slide.

Keeping Caught Up

  • If you have to miss a class, catch up before the next one.
  • We’ll try to record the classes using Zoom and then make them available afterwards.
    • You’ll find the recordings on Canvas in the Zoom section.
    • We prefer that you not join the Zoom live, but rather watch the recording when it is posted (usually the same day).
  • Our assignments have deadlines, which are posted to the Calendar, and which we expect you to meet.

Attendance Policies

  1. We expect you to attend 20 of the 26 classes in person, at minimum.
  2. Don’t come if you are sick, please. Watch the recording instead.
  3. If you’re getting over an illness, but are well enough to attend class, please mask up.
  4. If you will need to miss more than two classes in a row, or if you cannot keep up with assignments, that’s when Dr. Love needs to hear from you.

Great Statisticians in History

John Tukey (1915-2000)

Ten Things To Do After Class 01

  1. Review the main course website, being sure to visit the Course Calendar.
  2. Read through the Course Syllabus.
  3. Obtain David Spiegelhalter’s The Art of Statistics: How to Learn from Data (~$20).
  4. Complete the Welcome to 431 survey.
  5. Install the software you’ll need.
  6. Sign up for Campuswire.
  7. Look at the Course Notes.
  8. Bookmark two books, plus the RStudio Cheatsheets (see README).
  9. Ask us questions! Campuswire is available now. TA hours start 2023-09-05.
  10. Take a look at Lab 01 due 2023-09-12, our first substantial assignment.

See the Class 01 README for more details.

We’ll form groups in a moment

  • Shortly, we will ask each of you to join a group of five or six people. Join a group where you will meet at least one new person.
  • Also, one member of each group will serve as recorder and will need to open a Google Form on their laptop in a moment.
    • Everyone else needs nothing except their convenient piece of paper and a pen/pencil.
  • Your first task is to settle on a name for your group. Try to be a little creative.

We’ll be guessing some ages

  • I will display a series of 10 photographs, each of a person.
  • For each photo your group will …
    • estimate the age of the person in the photo (in years)
    • have the recorder type your (group) guess into the form (so if you guess age 35, you will just type 35.)
  • When you’ve produced guesses for each of the 10 photos, submit the form. The recorder will get an email confirmation.
  • Later, we’ll reveal the true ages and compute errors.

OK. Let’s form the groups.

  • Remember, your group should have FIVE OR SIX people, with at least one person you don’t already know.
  1. Select a group name.
  2. Select a recorder, who should visit the link below after logging into Google via CWRU.

https://bit.ly/431-2023-class01-breakout

  3. Make sure everyone knows everyone else’s name, as well as your group’s name.

Here come the photos

  • We’ll give you a little more time for the first two photos than the other eight.

Remember to have the recorder fill out the form at https://bit.ly/431-2023-class01-breakout although it may help all of you to keep track on paper, as well.

Photo 1

Photo 2

Making Progress

  • Your group’s guess for each photo should be in the form at https://bit.ly/431-2023-class01-breakout.
    • You might also want to keep track on your convenient piece of paper, so that when I tell you the ages later, you’ll be in a position to see how your group did.
  • In spare time between photos, please make the effort to learn something about each of the other people in your group beyond their name: perhaps what field they are in, or where they come from.

Photo 3

Photo 4

Photo 5

Photo 6

Photo 7

Photo 8

Photo 9

Photo 10

Now, guess My Age again

  1. You should have an initial guess of my age written down from the start of the session.
  2. Now, make a second guess of my age based on what you know about me now, and write that down next to the initial guess.

So if you guessed 18 initially, but now think I’m 19, you should write 18/19. If you still think I’m 18, write 18/18. Make it easy for us to understand your guesses of my age on the convenient piece of paper. Don’t guess my age as a group - just write down your own guess.

Age Guessing Robots?

Well, Microsoft used to have a tool online at how-old.net to do this. There are some related robots that still do the job online, although most people are unwilling to use them.

Do you think you did that well?

OK. Back to the photos!

Card 1

Card 2

Card 3

Card 4

No, not THAT Kevin Love

THIS Kevin Love, on the right (January 2019)

Card 5

Card 6

Card 7

Card 8

Card 9

Card 10

How did the AI do in August 2016?

Some Data from Prior Years

label      year  age  mean_guess  error
Chong      2021   21        25.9    4.9
Archuleta  2021   64        53.6  -10.4
Mayfield   2021   28        28.6    0.6
Love       2021   14        14.9    0.9
Chong      2020   21        25.1    4.1
Chong      2019   21        27.9    6.9
Chong      2018   21        24.7    3.7
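
The error column in the table above appears to be the mean guess minus the true age, so a positive error means the class guessed too old. That definition is inferred from the numbers rather than stated on the slide, although it matches every row shown. The course itself works in R; the short Python sketch below is only an illustration of the arithmetic, and the function name `guess_error` is hypothetical.

```python
# Rows copied from the prior-years table: (label, year, age, mean_guess).
rows = [
    ("Chong", 2021, 21, 25.9),
    ("Archuleta", 2021, 64, 53.6),
    ("Mayfield", 2021, 28, 28.6),
    ("Love", 2021, 14, 14.9),
    ("Chong", 2020, 21, 25.1),
    ("Chong", 2019, 21, 27.9),
    ("Chong", 2018, 21, 24.7),
]


def guess_error(age, mean_guess):
    """Assumed definition: error = mean guess - true age.

    Positive means the class guessed too old, negative too young.
    Rounded to one decimal place to match the table.
    """
    return round(mean_guess - age, 1)


errors = [guess_error(age, mean_guess) for _, _, age, mean_guess in rows]

for (label, year, age, mean_guess), err in zip(rows, errors):
    if err > 0:
        direction = "too old"
    elif err < 0:
        direction = "too young"
    else:
        direction = "exact"
    print(f"{label:<10} {year}  age {age:>2}  guess {mean_guess:>4}  "
          f"error {err:+.1f} ({direction})")
```

The same subtraction applies to your own group's guesses once the true ages are revealed: subtract the true age from your guess for each photo.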

Scatterplot of Prior Results, 1

Scatterplot of Prior Results, 2

Mean Class-Wide Guesses (2014-21)

Mean Class-Wide Errors (2014-21)

2021 Results: Labeled Scatterplot

Hans Rosling “The Joy of Stats”

200 countries over 200 years using 120,000 numbers, in less than 5 minutes.

And if you liked that …

Thanks for coming!

See you Thursday at 1 PM right here.